Prompt-based for Low-Resource Tibetan Text Classification

Authors

Abstract

Text classification is a critical and foundational task in Tibetan natural language processing. It plays a crucial role in various applications, such as sentiment analysis and information extraction. However, the limited availability of annotated data poses a significant challenge to Tibetan natural language processing. This paper proposes a prompt learning-based method for low-resource Tibetan text classification to overcome this challenge. The method utilizes pre-trained language models to learn text representation and generation capabilities on a large-scale unsupervised Tibetan corpus, enabling few-shot Tibetan text classification. Experimental results demonstrate that the proposed method significantly improves the performance of Tibetan text classification in low-resource scenarios. This work provides a new research idea for low-resource language processing. Hopefully, it will inspire subsequent work.


Similar resources

Text Readability Classification of Textbooks of a Low-Resource Language

There are many languages considered to be low-density languages, either because the population speaking the language is not very large, or because insufficient digitized text material is available in the language even though millions of people speak the language. Bangla is one of the latter ones. Readability classification is an important Natural Language Processing (NLP) application that can b...


Tibetan Text Clustering Based on Machine Learning

Tibetan information processing technology has made some progress, but it still lags behind Chinese and English information processing and deserves more attention. Text clustering has the potential to accelerate the development of Tibetan information processing. In this paper, we propose an approach to Tibetan text clustering based on machine learning. Firstly, the approach...


HR-CTC: A Large Human Resource Corpus for Text Classification

The absence of an appropriate text classification corpus makes the massive amount of online job information unusable for labor market analysis. This paper presents JCTC, a large job posting corpus for text classification. In JCTC construction framework, a formal specification issued by the Chinese central government is chosen as the classification standard. The unsupervised learning (WE-cos), s...


Low-Resource Speech-to-Text Translation

Speech-to-text translation has many potential applications for low-resource languages, but the typical approach of cascading speech recognition with machine translation is often impossible, since the transcripts needed to train a speech recognizer are usually not available for low-resource languages. Recent work has found that neural encoder-decoder models can learn to directly translate foreig...


An Algorithm for Resource Allocation through the Classification of DMUs

Data envelopment analysis (DEA) is a non-parametric method for assessing the relative efficiency of decision-making units (DMUs). Each decision-maker uses inputs to produce outputs. These decision-making units are defined by the production possibility set. Resource allocation to DMUs is one of the concerns of managers since managers can employ the results of this process to a...



Journal

Journal title: ACM Transactions on Asian and Low-Resource Language Information Processing

Year: 2023

ISSN: 2375-4699, 2375-4702

DOI: https://doi.org/10.1145/3603168